
Migrating HL compile and export to infer APIs #214

Open: wants to merge 9 commits into base branch main.
Conversation

@asmigosw (Contributor) commented Jan 8, 2025

Migrating HL compile API and export API to infer APIs

@quic-amitraj (Contributor):
Please rebase

Change-Id: If27fbc1636ed1fe9b475d07cef7c83ed7dc46ca8
Signed-off-by: Asmita Goswami <[email protected]>
```python
) # type: ignore
logger.info(f"Generated onnx_path: {onnx_model_path}, onnx_dir_path: {onnx_dir_path}")
logger.info(f"Exporting Pytorch {model_name} model to ONNX...")
qeff_model = QEFFAutoModelForCausalLM.from_pretrained(model_name, cache_dir)
```
Contributor:
We should not restrict the CLI APIs to `AutoModelForCausalLM` only. They should stay generic as we add support for new auto classes.

@quic-amitraj quic-amitraj marked this pull request as draft January 16, 2025 10:36
@ochougul ochougul marked this pull request as ready for review January 29, 2025 03:42
```python
hf_token: Optional[str] = None,
local_model_dir: Optional[str] = None,
```
Contributor:

Why are you removing this?

```python
@@ -92,7 +76,6 @@ def main(
    model_name=model_name,
    cache_dir=cache_dir,
    hf_token=hf_token,
    local_model_dir=local_model_dir,
```
Contributor:

?

Comment on lines 107 to 115
```python
config = AutoConfig.from_pretrained(model_name)
architecture = config.architectures[0] if config.architectures else None

model_class = architecture_mapping.get(architecture)
if not model_class:
    logger.error(f"Model class for model name {model_name} not found in mapping")
    return

qeff_model = model_class.from_pretrained(model_name)
```
Contributor:

Why? Use `QEFFAutoModelForCausalLM.from_pretrained` directly.

Contributor:

Instead of writing your own dictionary, please make use of `MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES` and `MODEL_FOR_CAUSAL_LM_MAPPING_NAMES` from transformers:

```python
from transformers.models.auto.modeling_auto import MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES, MODEL_FOR_CAUSAL_LM_MAPPING_NAMES
```

Then check whether the architecture is present in the values of either of these two dictionaries, and call our corresponding auto class based on that.
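The dispatch the reviewer describes could be sketched as below. The registries are passed in as plain dicts so the logic is easy to exercise; in the real CLI they would be the two transformers mappings named above (both are ordered dicts of `model_type` to architecture class name), and the returned labels would map to the matching QEff auto classes. This is an illustrative sketch, not the PR's code:

```python
def resolve_model_family(architecture, causal_lm_names, image_text_names):
    """Decide which QEff auto class family should load `architecture`.

    `architecture` is config.architectures[0]; the two mapping arguments are
    expected to be shaped like transformers' MODEL_FOR_CAUSAL_LM_MAPPING_NAMES
    and MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES (model_type -> class name).
    """
    if architecture in causal_lm_names.values():
        return "causal_lm"           # -> QEFFAutoModelForCausalLM
    if architecture in image_text_names.values():
        return "image_text_to_text"  # -> the corresponding QEff image-text class
    # Fail fast on anything unrecognized, as suggested elsewhere in the review.
    raise NotImplementedError(
        f"Unknown architecture={architecture}, either use specific auto model "
        "class for loading the model or raise an issue for support!"
    )

# Demo with stub registries shaped like the transformers mappings:
demo_causal = {"llama": "LlamaForCausalLM", "gpt2": "GPT2LMHeadModel"}
demo_image_text = {"llava": "LlavaForConditionalGeneration"}
print(resolve_model_family("LlamaForCausalLM", demo_causal, demo_image_text))
```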

Comment on lines 19 to 32
```python
# Map model's architecture to class
architecture_mapping = {
    "LlamaForCausalLM": QEFFAutoModelForCausalLM,
    "GPT2LMHeadModel": QEFFAutoModelForCausalLM,
    "MistralForCausalLM": QEFFAutoModelForCausalLM,
    "FalconForCausalLM": QEFFAutoModelForCausalLM,
    "GPTJForCausalLM": QEFFAutoModelForCausalLM,
    "GemmaForCausalLM": QEFFAutoModelForCausalLM,
    "Gemma2ForCausalLM": QEFFAutoModelForCausalLM,
    "Phi3ForCausalLM": QEFFAutoModelForCausalLM,
    "Qwen2ForCausalLM": QEFFAutoModelForCausalLM,
    "GPTBigCodeForCausalLM": QEFFAutoModelForCausalLM,
}
```

Contributor:

remove

Signed-off-by: Asmita Goswami <[email protected]>

```python
from QEfficient.exporter.export_hf_to_cloud_ai_100 import qualcomm_efficient_converter
from QEfficient.utils import check_and_assign_cache_dir, onnx_exists
from QEfficient.transformers.models.modeling_auto import QEFFAutoModelForCausalLM
```
Contributor:

Use `from QEfficient import QEFFAutoModelForCausalLM`; it's already exported in `__init__`.

```python
    full_batch_size=full_batch_size,
) # type: ignore
logger.info(f"Generated onnx_path: {onnx_model_path}, onnx_dir_path: {onnx_dir_path}")
logger.error(f"Model class for model name {model_name} not found in mapping")
```
Contributor:

```python
raise NotImplementedError(f"Unknown architecture={architecture}, either use specific auto model class for loading the model or raise an issue for support!")
```

Contributor:

We should fail here, which will force the script to exit.
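The behavioral difference the reviewer is pointing at, as a sketch: raising an exception propagates a non-zero exit status, whereas `logger.error(...)` followed by `return` lets the CLI finish "successfully" with the job half done. (`strict_lookup` is a hypothetical helper for illustration, not part of the PR.)

```python
def strict_lookup(mapping, architecture):
    # Fail fast: an unknown architecture raises instead of logging and
    # silently returning, so the script exits with a non-zero status.
    if architecture not in mapping:
        raise NotImplementedError(
            f"Unknown architecture={architecture}, either use specific auto model "
            "class for loading the model or raise an issue for support!"
        )
    return mapping[architecture]
```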

```python
) # type: ignore
logger.info(f"Generated onnx_path: {onnx_model_path}, onnx_dir_path: {onnx_dir_path}")
logger.error(f"Model class for model name {model_name} not found in mapping")
return
```
Contributor:

remove

```python
    enable_qnn=enable_qnn,
    qnn_config=qnn_config,
)
logger.error(f"Model class for model name {model_name} not found in mapping")
```
Contributor:

Same here: raise an error instead of logging.

Comment on lines 94 to 108
```python
config = AutoConfig.from_pretrained(model_name)
architecture = config.architectures[0] if config.architectures else None

if architecture in MODEL_FOR_CAUSAL_LM_MAPPING_NAMES.values():
    model_class = QEFFAutoModelForCausalLM
else:
    # Handle onnx model generation
    onnx_model_path = get_onnx_model_path(
        model_name, cache_dir, tokenizer, hf_token, local_model_dir, full_batch_size
    )  # , base_dir_name)

    #########
    # Compile
    #########
    _ = QEfficient.compile(
        onnx_path=onnx_model_path,
        qpc_path=os.path.dirname(
            qpc_dir_path
        ),  # We need to pass parent directory of qpc_dir_path, as the compile function handles the qpcs directory creation
        num_cores=num_cores,
        batch_size=batch_size,
        prompt_len=prompt_len,
        ctx_len=ctx_len,
        mxfp6=mxfp6,
        mxint8=mxint8,
        aic_enable_depth_first=aic_enable_depth_first,
        mos=mos,
        device_group=device_group,
        full_batch_size=full_batch_size,
        allow_mxint8_mdp_io=allow_mxint8_mdp_io,
        enable_qnn=enable_qnn,
        qnn_config=qnn_config,
    )
    logger.error(f"Model class for model name {model_name} not found in mapping")
    return

qeff_model = model_class.from_pretrained(
    pretrained_model_name_or_path=(local_model_dir if local_model_dir else model_name),
    cache_dir=cache_dir,
    hf_token=hf_token,
    full_batch_size=full_batch_size,
)
```
Contributor:

Since this code is a copy of the same block in the export method, create a common method (e.g. `load_qeff_model`) in a utils file in the cloud folder and call it from both places.

asmigosw and others added 3 commits January 29, 2025 11:20
Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Onkar Chougule <[email protected]>